The data file is ketchley_wenig_CPS.dta.

The main results file is ketchley_wenig_CPS.do. 

The R script "appendix_random_forest.R" imputes missing values on additional controls by chaining random forests (Appendix Table A6). Read the .dta file into R, export the result to .csv, and then read the .csv back into Stata.
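
The round trip between Stata and R described above could look roughly like the sketch below. It is only an illustration, not the contents of "appendix_random_forest.R": it assumes the haven package is installed, the helper name export_dta_to_csv and the output file name are hypothetical, and the imputation step itself is left as a placeholder comment.

```r
# Hypothetical helper: read a Stata file and write it out as CSV.
# Assumes the 'haven' package is installed.
library(haven)

export_dta_to_csv <- function(dta_path, csv_path) {
  dat <- read_dta(dta_path)   # read the Stata .dta file into R
  dat <- zap_labels(dat)      # drop Stata value labels, keep the underlying values
  # ... the imputation from appendix_random_forest.R would happen here ...
  # Write missing values as empty cells so Stata reads them as missing.
  write.csv(dat, csv_path, row.names = FALSE, na = "")
  invisible(dat)
}

# Example call (the output file name is an assumption):
# export_dta_to_csv("ketchley_wenig_CPS.dta", "ketchley_wenig_CPS_imputed.csv")
```

On the Stata side, a .csv written this way can be loaded with the import delimited command (for example, import delimited "ketchley_wenig_CPS_imputed.csv", clear).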

The R script "appendix_job_titles.R" produces the top 10 terms in job titles (Figure A3) and a document-term matrix showing word co-occurrence in job titles (Appendix Table A1).